Why memory-mapped files are always mapped at page boundaries?

Ask Time:2012-06-22T04:38:08         Author:favq

This is my first question here; I'm not sure if it is off-topic.

While self-studying, I have found the following statement regarding Operating Systems:

Operating systems that allow memory-mapped files always require files to be mapped at page boundaries. For example, with 4-KB pages, a file can be mapped in starting at virtual address 4096, but not starting at virtual address 5000.
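
For anyone who wants to see the constraint in practice, here is a minimal sketch using Python's standard `mmap` module. Python does not let you pick the virtual address, so this only shows the file-offset side of the same rule: the offset has to be a multiple of `mmap.ALLOCATIONGRANULARITY`. The helper names `can_map_at` and `make_demo_file` and the 904-byte shift (5000 - 4096) are my own choices for illustration.

```python
import mmap
import os
import tempfile

def can_map_at(path, offset, length=16):
    """Return True if the OS accepts a read-only mapping starting at this file offset."""
    with open(path, "rb") as f:
        try:
            m = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ, offset=offset)
        except (ValueError, OSError):
            # The exact exception type differs between platforms, so catch both.
            return False
        m.close()
        return True

def make_demo_file():
    """Create a temporary file spanning three allocation units (hypothetical helper)."""
    gran = mmap.ALLOCATIONGRANULARITY
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * (gran * 3))
    return path, gran

path, gran = make_demo_file()
aligned_ok = can_map_at(path, gran)          # offset on a boundary
unaligned_ok = can_map_at(path, gran + 904)  # offset in the middle of a page
os.remove(path)

print("granularity:", gran)
print("aligned offset accepted:", aligned_ok)
print("unaligned offset accepted:", unaligned_ok)
```

On the systems I am aware of, the aligned offset is accepted and the unaligned one is rejected, which is the behaviour the statement describes.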

This statement is explained in the following way:

If a file could be mapped into the middle of a page, a single virtual page would need two partial pages on disk to map it. The first page, in particular, would be mapped onto a scratch page and also onto a file page. Handling a page fault for it would be a complex and expensive operation, requiring copying of data. Also, there would be no way to trap references to unused parts of pages. For these reasons, it is avoided.
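
One way to read the "two partial pages" remark is as plain address arithmetic. The sketch below is only my interpretation of the quoted explanation, not anything from the book: it assumes 4-KB pages, ignores the file length, and uses made-up function names. For a mapping whose first file byte sits at `map_start`, it reports which file (disk) pages a given virtual page would need and how many bytes of that virtual page lie before the file begins.

```python
PAGE = 4096  # assumed page size, as in the quoted example

def disk_pages_for_virtual_page(vpage, map_start, page=PAGE):
    """File pages needed to fill virtual page `vpage` when file byte 0 is at address map_start."""
    lo = max(vpage * page, map_start)   # first address in this page that belongs to the file
    hi = (vpage + 1) * page - 1         # last address in this page
    if hi < map_start:
        return []                       # the page lies entirely before the mapping
    first = (lo - map_start) // page    # file page holding the first needed byte
    last = (hi - map_start) // page     # file page holding the last needed byte
    return list(range(first, last + 1))

def unbacked_bytes(vpage, map_start, page=PAGE):
    """Bytes at the start of virtual page `vpage` that come before the file begins."""
    return max(0, min(map_start, (vpage + 1) * page) - vpage * page)

for start in (4096, 5000):
    for vpage in (1, 2):
        print(start, vpage,
              disk_pages_for_virtual_page(vpage, start),
              unbacked_bytes(vpage, start))
```

With `map_start = 4096`, virtual page 1 needs only file page 0 and virtual page 2 needs only file page 1. With `map_start = 5000`, virtual page 1 has 904 bytes that are not part of the file (which would have to live somewhere else, presumably the "scratch page") plus part of file page 0, and virtual page 2 needs pieces of file pages 0 and 1.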

I would like to ask for help to understand this answer. Particularly, what does it mean to say that "a single virtual page would need two partial pages on disk to map it"? From what I found about memory-mapped files, virtual pages are mapped to files on disk, and not to a paging file. Is this what is meant by "partial page"?

Also, what is meant by "scratch page" here? I've tried to look up this term in books (Tanenbaum's "Modern Operating Systems" and "Structured Computer Organization") and on the Web, but haven't found it.

Author: favq. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/11146352/why-memory-mapped-files-are-always-mapped-at-page-boundaries
Kirill Kobelev:

First of all, when reading books and documentation, always look critically at what you read. Authors sometimes use language like "there is no other way" just to promote the solution they are describing. Other ways are always possible.

Now to the matter. Modern operating systems typically have a disk location for every allocated memory page. This makes sense: when a page has to be evicted from memory, it is already clear where to write it if it is 'dirty', and it can simply be discarded if it has not been modified. This strategy is widely accepted, although alternative policies are also possible.

The disk location can be either the paging file or a memory-mapped file. The most common use of memory-mapped files is executables and DLLs. They are (almost) never modified. If a page of code has not been used for some time, it is discarded. If control reaches it again, it is reread from the file.

In the passage you quoted, they say that a page "would need two partial pages on disk to map it. The first page, in particular, would be mapped onto a scratch page." They present the situation as if there were only one solution. In fact, it is possible to allocate a page in the paging file for such a combined page and handle the appropriate data copying. It is also possible to keep nothing in the paging file for such a page and assemble it from the files using a transient page. In 99% of cases the disk controller can read and write only at page boundaries. This means that you need to read from the first file into the memory page and from the second file into the transient page, then copy the data from the transient page and immediately discard it.

As you see, it is perfectly possible to combine several files in one page. There is no problem in principle here, although the algorithms for handling this will be more complex and will consume more CPU cycles. Reconstructing such a page (if it is discarded) will require reading from several different files.

These days 4 KB is a rather small quantity, and saving 2 KB is not a huge gain. In my opinion, weighing the benefits against the cost, the benefits are not significant enough.
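
The transient-page approach described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how any real kernel does it: the "disk" is a pair of in-memory byte strings, pages are 4 KB, and the names `read_disk_page` and `assemble_combined_page` are invented for the example.

```python
PAGE = 4096  # assumed page size

def read_disk_page(blob, page_no, page=PAGE):
    """Stand-in for a disk read: always one whole page at a page boundary, zero-padded."""
    chunk = blob[page_no * page:(page_no + 1) * page]
    return bytearray(chunk.ljust(page, b"\0"))

def assemble_combined_page(file_a, file_b, split, page=PAGE):
    """Build one memory page: bytes [0, split) from file_a, bytes [split, page) from the start of file_b."""
    frame = read_disk_page(file_a, 0, page)      # first file is read straight into the target frame
    transient = read_disk_page(file_b, 0, page)  # second file can only land in a separate, aligned page
    frame[split:] = transient[:page - split]     # the extra copy that an aligned mapping never needs
    del transient                                # the transient page is discarded immediately
    return bytes(frame)

combined = assemble_combined_page(b"A" * 1000, b"B" * 5000, split=1000)
print(len(combined), combined[:3], combined[998:1002])
```

Every fault on such a combined page costs two disk reads and a memory copy, whereas a page-aligned mapping needs a single read into the frame. That is the extra complexity and CPU cost being weighed against the few kilobytes saved.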
2012-06-21T21:30:11